Modeling the masking of formant transitions in noise
Abstract
Formant transitions are critical for identifying the place of articulation of consonants. If these transitions are masked by background noise, perceptual confusions can occur. To better understand the masking of formant transitions, masked thresholds were measured for tone glides and single-formant trajectories of varying frequency extent (0-3 ERBs), duration (10, 30, and 100 ms), and center frequency (0.5, 1.5, and 3.5 kHz). Results show that thresholds are independent of frequency extent and depend only on the duration and center frequency of the transition. A novel time-frequency detection model, fit to previous noise-in-noise masking experiments (JASA 101, 2789-802 (1997)), is proposed that can predict these data.
Although there have been several studies on the noise masking of stationary signals such as tones (e.g. Garner and Miller, 1947; Plomp and Bouman, 1959), few studies have measured the masking of non-stationary stimuli. Collins and Cullen (1978) measured the masked thresholds of both rising and falling tone glides with frequency extents of 200 to 700 Hz and 1200 to 1700 Hz, and durations between 10 and 120 ms. Tone thresholds were about 4 dB lower than glide thresholds, and for durations between 10 and 35 ms, rising glides were more detectable than falling glides. Nabelek (1978) measured glide thresholds over a wider range of frequency extents and durations and found substantial differences between glide and tone thresholds only at the largest frequency extents and shortest durations. None of these studies, however, specifically measured the masking of formant transitions. It is not clear whether the thresholds for these multiple-harmonic stimuli are similar to those of glides. Further, no model has been developed to predict the noise masking of any type of non-stationary stimulus. With this in mind, masking experiments were conducted using glides and formant transitions of varying center frequency, duration, and frequency extent.
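Because the frequency extents above are specified in ERBs while the center frequencies are in kHz, it can help to see what an extent of a few ERBs means in Hz. The following is a minimal sketch, not the authors' stimulus-generation code. It assumes the ERB-number scale of Glasberg and Moore (1990), and it assumes that a glide is placed symmetrically around its center frequency on that scale; neither assumption is stated in the text, and the function names are hypothetical.

```python
import math


def hz_to_erb_number(f_hz):
    """ERB-number (Cam) for a frequency in Hz; assumed Glasberg and Moore (1990) scale."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erb_number_to_hz(erb_number):
    """Inverse of hz_to_erb_number."""
    return (10.0 ** (erb_number / 21.4) - 1.0) * 1000.0 / 4.37


def glide_endpoints_hz(center_hz, extent_erb):
    """Start and end frequency of a rising glide.

    Assumption: the glide spans extent_erb ERBs and is centered on
    center_hz on the ERB-number scale (hypothetical convention).
    """
    center_erb = hz_to_erb_number(center_hz)
    low = erb_number_to_hz(center_erb - extent_erb / 2.0)
    high = erb_number_to_hz(center_erb + extent_erb / 2.0)
    return low, high


if __name__ == "__main__":
    # Center frequencies and extents taken from the ranges quoted in the text.
    for center in (500.0, 1500.0, 3500.0):
        for extent in (0.0, 1.5, 3.0):
            low, high = glide_endpoints_hz(center, extent)
            print(f"{center:6.0f} Hz, {extent:.1f} ERB: {low:7.1f} to {high:7.1f} Hz")
```

Under these assumptions, a fixed extent in ERBs covers a much wider range in Hz at 3.5 kHz than at 0.5 kHz, which is why the extent is expressed on an auditory scale in the first place.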
A novel time-frequency detection model was developed that can predict the masked thresholds of these non-stationary stimuli. Traditional models of simultaneous masking (e.g. Fletcher, 1940; Patterson, 1976) have focused on the masking of long-duration, narrowband stimuli. In these models, the signal and noise are passed through the "optimal" auditory filter centered on the signal's center frequency, and if the filtered SNR exceeds a certain threshold, the sound is heard. For glides and formant transitions, however, the "optimal" filter is constantly changing, and thus there may need to be a mechanism that combines information across multiple filter outputs. One …
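The traditional single-filter decision rule described above can be illustrated for a stationary tone in white noise. This is a minimal sketch of that classical rule only, not of the time-frequency model proposed in the paper. The rounded-exponential filter shape, the Glasberg and Moore (1990) ERB formula, the 0 dB default criterion, and all function names are assumptions introduced for illustration.

```python
import math


def erb_hz(fc_hz):
    """Equivalent rectangular bandwidth in Hz; assumed Glasberg and Moore (1990) formula."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def roex_weight(f_hz, fc_hz):
    """Power weighting of an assumed symmetric roex(p) auditory filter centered at fc_hz."""
    p = 4.0 * fc_hz / erb_hz(fc_hz)    # slope chosen so the filter's ERB matches erb_hz
    g = abs(f_hz - fc_hz) / fc_hz      # normalized distance from the center frequency
    return (1.0 + p * g) * math.exp(-p * g)


def filtered_noise_power(fc_hz, noise_density, step_hz=1.0):
    """Noise power at the filter output for white noise of the given density (power per Hz).

    Numerically integrates the weighted noise spectrum from 0 Hz to 2 * fc_hz.
    """
    n_steps = int(2.0 * fc_hz / step_hz)
    total = sum(roex_weight(i * step_hz, fc_hz) for i in range(n_steps + 1))
    return noise_density * total * step_hz


def is_audible(tone_power, fc_hz, noise_density, criterion_db=0.0):
    """Single-filter rule: the tone is heard if the filtered SNR exceeds the criterion.

    criterion_db is a hypothetical placeholder, not a value from the paper.
    """
    # The tone sits at the filter center, where the weight is 1, so it passes unattenuated.
    snr_db = 10.0 * math.log10(tone_power / filtered_noise_power(fc_hz, noise_density))
    return snr_db > criterion_db


if __name__ == "__main__":
    fc = 1500.0
    print(is_audible(tone_power=10.0 * erb_hz(fc), fc_hz=fc, noise_density=1.0))
```

The sketch fixes one filter at the tone frequency. For a glide, the frequency at which the signal is best represented moves during the stimulus, so no single fixed filter of this kind captures the whole signal, which is the limitation the paragraph above points to.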
Related articles
Masking of vowel-analog transitions by vowel-analog distracters
Single-formant dynamically changing harmonic vowel analogs, a target with a single frequency excursion and a longer distracter with a different fundamental frequency and repeated excursions were generated to assess informational and energetic masking of target transitions in young and elderly listeners. Results indicate the presence of informational masking that is significant only for formant ...
Predicting the perceptual confusion of synthetic plosive consonants in noise
In previous work, a novel, time/frequency detection model was developed based on psychoacoustic masking experiments and used to predict the noise masking of speech-like bursts and formant transitions [5]. In this paper, the same model is used to predict the discrimination of voiced synthetic plosive consonants in a variety of noisy environments. Discrimination experiments were conducted using s...
Susceptibility to intraspeech spread of masking in listeners with sensorineural hearing loss.
Previous research with speechlike signals has suggested that upward spread of masking from the first formant (F1) may interfere with the identification of place of articulation information signaled by changes in the upper formants. This suggestion was tested by presenting two-formant stop consonant-vowel syllables varying along a /ba/-/da/-/ga/ continuum to hearing-impaired listeners grouped ...
A psychoacoustic-masking model to predict the perception of speech-like stimuli in noise
In this paper, a time/frequency, multi-look masking model is proposed to predict the detection and discrimination of speech-like stimuli in a variety of noise environments. In the first stage of the model, sound is processed through an auditory front end which includes bandpass filtering, squaring, time windowing, logarithmic compression and additive internal noise. The result is an internal re...
Psychoacoustic and phoneme identification measures in cochlear-implant and normal-hearing listeners.
The purpose of this study is to identify precise and repeatable measures for assessing cochlear-implant (CI) hearing. The study presents psychoacoustic and phoneme identification measures in CI and normal-hearing (NH) listeners, with correlations between measures examined. Psychoacoustic measures included pitch discrimination tasks using pure tones, harmonic complexes, and tone pips; intensity ...